Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RLlib] Update bandit_envs_recommender_system #22421

Merged
merged 8 commits into from
Feb 24, 2022

Conversation

gjoliver
Copy link
Member

@gjoliver gjoliver commented Feb 16, 2022

Why are these changes needed?

Update bandit_envs_recommender_system with Sven's in-house implementa…, which supports multiple users and slate sizes.
Make sure example tune_lin_ucb_train_recommendation.py nails this environment.

Related issue number

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(



class ParametricItemRecoEnv(gym.Env):
"""A recommendation environment which generates items with visible features
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did we remove this? I feel like we should explain, why it's called parametric. Or re-name it, as any of our recomm envs is somewhat parametric, no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh good catch. yeah, I should have copied over the description.
done.

def render(self, mode="human"):
raise NotImplementedError

# Because ParametricRecSys follows RecSim's API, we have to wrap it before
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we do this here?
Maybe just move this into the block below and register it under "ParametricRecSysEnvForBandits" or something like this.
Also, in the example script that actually uses this env, we can register it to show, how this is done.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the idea of registering the env in example script. good demo.
thanks.

Copy link
Contributor

@sven1977 sven1977 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks for the cleanup @gjoliver ! :) Left only some minor nits.

@sven1977 sven1977 added the tests-ok The tagger certifies test failures are unrelated and assumes personal liability. label Feb 17, 2022
@sven1977 sven1977 self-assigned this Feb 17, 2022
Copy link
Member Author

@gjoliver gjoliver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done. thanks for the review.



class ParametricItemRecoEnv(gym.Env):
"""A recommendation environment which generates items with visible features
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh good catch. yeah, I should have copied over the description.
done.

def render(self, mode="human"):
raise NotImplementedError

# Because ParametricRecSys follows RecSim's API, we have to wrap it before
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the idea of registering the env in example script. good demo.
thanks.

@sven1977
Copy link
Contributor

Hey @gjoliver , could you re-merge? Lots of tests failing, I'm thinking b/c we are too far behind master.

Jun Gong added 5 commits February 22, 2022 11:08
…tion, which supports multiple users and slate sizes.

Make sure example tune_lin_ucb_train_recommendation.py nails this environment.
(
"item",
gym.spaces.Box(
low=-np.ones((num_items, embedding_dim)),
high=np.ones((num_items, embedding_dim)),
low=-1.0, high=1.0, shape=(num_items, embedding_dim)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Copy link
Contributor

@sven1977 sven1977 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fixes @gjoliver . Great PR! :)

@sven1977 sven1977 merged commit a385c9b into ray-project:master Feb 24, 2022
simonsays1980 pushed a commit to simonsays1980/ray that referenced this pull request Feb 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tests-ok The tagger certifies test failures are unrelated and assumes personal liability.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants